(54) Nucleic acid sequences encoding citrate transporter proteins

(57) The invention relates to nucleic acids that
encode a secondary transporter that transports citrate
or the complex of citrate and metal ions or metal salt
ions. These nucleic acids can be functionally expressed
in host cells such as E. coli and Bacillus, or other host
cells. The transporter proteins encoded by the nucleic
acids of the invention, and host cells or biological
membranes comprising the proteins, facilitate and enhance
the removal of heavy metals from compositions such as
waste, waste disposal sites and metal ores.
Description



Heavy metals are becoming an increasingly serious problem to the environment. Their toxicity and the fact that they
tend to get diluted or dissolved and distributed out of waste deposits make them a major environmental hazard,
whereas at the same time those same metals are a precious commodity, which would make it worthwhile to harvest
them from the environment. Environmental contamination with metals occurs for example in waste, or in
ground or process water stemming from mining, the metal industry, viscose industry, rubber industry, paper industry,
potato industry, starch processing, yeast industry, and so on. In addition, mining for metals is itself a process in which
metals are harvested from the environment, and all of the above activities entail processes in which metals are recovered.
Bacterial or biological technology, using selected naturally occurring microorganisms that can help concentrate and
accumulate heavy metals, is currently in development. Sulphate reducing bacterial strains have been detected in nature
which specifically convert sulphate into sulphide (which takes place under oxygen-free, anaerobic conditions). The
sulphide then reacts with dissolved metals, producing a metal precipitate which can be removed. Such processes can be
carried out in reactors. Other biological processes for recovering metals make use of bacteria that reduce the heavy
metal itself and create metal precipitates. However, the biological processes described above are basically performed
in an oxygen-free environment under anaerobic conditions, which are not easy to achieve when large
masses of waste or material containing dilute concentrations of the sought-after metal need to be handled.

Citrate is very abundant in nature and most bacteria have transport proteins in the cytoplasmic membrane that
mediate the uptake of citrate. The carriers belong to the class of secondary transporters that use the free energy stored
in transmembrane electrochemical gradients of ions to drive the uptake of the substrates (for a review, see 18). The
citrate transporter CitH of Klebsiella pneumoniae is driven by the proton motive force (23) and the transporters CitS and
CitC of K. pneumoniae and Salmonella serovars by both the proton motive force and sodium ion motive force
(9,14,19,24). Mechanistically these transporters catalyze coupled translocation of citrate and H+ and/or Na+ (symport).
A special case are the citrate carriers of lactic acid bacteria that take up citrate by an electrogenic uniport mechanism
or by exchange with lactate, a product of citrate metabolism (citrolactic fermentation) (16,17,20). These citrate
transporters are involved in secondary metabolic energy generation (12).

A number of structural genes coding for citrate transporters have been cloned and the primary sequences have
been deduced from the base sequences. The proton dependent citrate carrier of K. pneumoniae, CitH, belongs to a
large family of proteins in which many sugar transporters are also found (22). The Na+ dependent citrate carriers CitS
of K. pneumoniae and CitC of Salmonella serovars form, together with the citrate carriers of lactic acid bacteria (CitP's), a distinct
family termed the 2-hydroxy-carboxylate carriers (4,15,24). The malate transporter of Lactococcus lactis, MleP, that is
involved in malolactic fermentation is also a member of this family (1).

Citrate is a chelator that forms stable complexes with various metal ions or metal salt ions such as but not limited to
Mn2+, Zn2+, Mg2+, Be2+, Ba2+, Ca2+, Cu2+, Co2+, Fe2+, Fe3+, Pb2+, Cd2+, UO2(2+) and Ni2+. The presence of metal ions
results in inhibition of citrate transport activity by the transporters mentioned above (16,23,24), showing that the metal
ion/citrate complex is not a substrate of these citrate transporters. On the other hand, other bacteria including
Pseudomonas and Klebsiella spp. and Bacillus subtilis are known to preferentially take up and degrade citrate in the
metal ion bound complex (2,3,10,26).

These microorganisms have been implicated in the prevention of mobilization of toxic metal wastes by chelators
like citrate. Degradation of the metal ion/citrate complex would render the metal ion in an insoluble, immobilized state
(7). A complication is that the nature of the metal ion in the complex determines whether or not the complex is degraded.
Studies with Pseudomonas fluorescens have shown that, at least for a number of metal ions, the lack of degradation was
caused by the lack of transport of the complex into the cells and not by the toxicity of the metal ion. The
transporter seemed to recognize only the bidentate metal ion/citrate complexes that leave the hydroxyl group of citrate free,
and not tridentate complexes (10).

As an example, the citrate carrier or citrate transporter protein of B. subtilis transports citrate in a complex with a
wide variety of metal ions. Studies with membrane vesicles showed that the highest uptake rates were observed with
Mn2+, intermediate rates with Zn2+, Mg2+, Be2+, Ba2+, Ca2+ and Cu2+ and the lowest rates with Co2+ and Ni2+ (2). It
can however be expected that other citrate carriers have other metal specificities.

The present invention now provides a nucleic acid and derivatives thereof encoding genes of a new family of
secondary transporter proteins. A first gene, termed CitM, was identified in Bacillus subtilis. Functional expression in
Escherichia coli showed this gene to encode a citrate transporter protein that preferentially transports a metal-citrate
complex. The invention thus provides a gene that can be introduced in any bacterial or biological host cell to be
used in bacterial or biological processing or recovery of metal. Such host cells can for example be selected from any of
the bacterial strains from the genera Pseudomonas, Klebsiella, or Bacillus, but many other bacteria or other host cells
can also be used. One may for example select those strains that thrive well in the presence of metal or metal salt ions
of various nature. A second gene, termed CitH, was identified in Bacillus subtilis by searching available databases
for sequences resembling the CitM gene. The invention thus reveals a further citrate transporter.
The invention thus provides genes that code for a new family of secondary transporters. Studying the topology of
the proteins encoded by these genes reveals a protein topology consisting of 12 transmembrane segments interspaced
by a total of 11 loops. Within these transmembrane fragments and loops, topology requirements and active sites are
present which, when modified via recombinant technology, will alter the metal specificity of the encoded citrate
transporter protein. Transporter proteins modified in this way have either a narrower or a broader metal ion specificity in the metal
ion/citrate complex that is recognized by the carrier. The present invention encompasses this possibility of generating
a nucleic acid sequence encoding a modified citrate transporter protein via the further application of recombinant DNA
technology. Such modifications may be single or multiple mutations, substitutions, deletions or insertions or
combinations thereof that can be achieved via any recombinant DNA technology methods known in the art. The present
invention makes it possible to modify citrate transporter proteins so that they interact specifically with a particular
metal or metals or a salt thereof, and can be used in processes recovering metals. Such tailor-made proteins can be
used in vivo, being present as the active component in bacteria used for the recovery of metals from industrial waste
and the like, but can also be used in vitro as the active component in, for example, artificial biomembranes that will be used
for metal recovery.

The invention also provides methods to select microorganisms that comprise said genes of a new family of
secondary transporter proteins. Such selection methods can for example be based on the wide array of nucleic acid amplification
techniques that is now available to the art. Such microorganisms can then advantageously be used for the industrial
recovery of metal.

Experimental part

MATERIALS AND METHODS 

Bacterial strains and growth conditions. 

E. coli strain JM101 harboring plasmid pWSKcitM coding for the divalent cation dependent citrate carrier of B.
subtilis and strain JM109(DE3) harboring plasmid pWSKcitH coding for the proton dependent citrate carrier of B.
subtilis were grown in 6 l flasks containing 1 l medium at 37 °C under vigorous shaking. JM101 was grown in LB
medium, JM109(DE3) in LB medium or minimal medium containing citrate as the sole carbon source. Carbenicillin was
added at a concentration of 100 µg/ml and isopropyl-β-D-thiogalactopyranoside (IPTG) was added when appropriate.
The cells were harvested in the late exponential growth phase and used immediately for uptake experiments or the
preparation of membrane vesicles. B. subtilis strain 6GM was grown at 37 °C with vigorous aeration in medium containing
0.8% tryptone (Difco), 0.5% NaCl, 25 mM KCl and 10 mM Na-citrate.

Cloning and sequencing of CitM.

Chromosomal and plasmid DNA isolations and all other genetic techniques were done using the standard protocols
described by Sambrook et al. (21) or the manufacturer's protocols. Chromosomal DNA isolated from B. subtilis 6GM
was partially digested with the restriction enzyme HindIII and the fragments were ligated in the multiple cloning site of
vector pINIIIA (8) restricted with the same enzyme. Two fragments of 0.9 kb and 1.8 kb were cut from clone pM5
with HindIII and ligated in the multiple cloning site of plasmid pBluescript II SK (Stratagene), yielding pSK0.9 and pSK1.8.
Sets of nested deletions starting at both ends of the inserts were constructed of pSK0.9 and pSK1.8 using the
Erase-a-base System (Promega). The plasmids were digested with KpnI or SacI to create protected 3' overhangs and SalI or
BamHI to allow digestion into the fragments. The subclones were sequenced on a Vistra 725 automated sequencer
using Texas Red labeled forward and reverse primers of the pBluescript vector (Fig. 1). The sequencing reactions were
performed using the Thermo Sequenase labelled primer cycle sequencing kit (Vistra Systems) with 7-deaza-dGTP
according to the manufacturer's protocol.

Southern blotting. 

DNA probes were prepared by amplifying the regions on genomic DNA of B. subtilis that code for CitM and open
reading frame N15CR by PCR. The CitM probe constitutes approximately the first 487 nucleotides of citM and the
N15CR probe the last 463 nucleotides of N15CR. The probes were labeled with digoxigenin by including DIG-dUTP in
the PCR reaction. The reaction contained, in a total volume of 30 µl, 3 µl superTaq buffer, 10 ng template DNA, 6 µl of
a mixture of dATP, dCTP, dGTP (0.65 mM each) and DIG-dUTP (0.35 mM), 2.5 U superTaq and 1 µl gelatine (5 mg/ml
stock). The oligonucleotide primers were used at a concentration of 0.03 ng/µl. The forward and backward primers for
the CitM probe were
5'-TTAAGGGGCCATGGATGTGTTAGC-3' and
5'-CTCCCCAAGGAATCGTGTTC-3',
and for the N15CR probe
5'-GGTGGATGCAATGGCGCATTC-3' and
5'-ATGAATCTTCTAGACACTCATAGGATCCTATTGATC-3'.
The PCR reactions yielded fragments of the expected size and control reactions in the absence of DIG-dUTP
yielded identical fragments. Restriction analysis confirmed that the correct regions had been amplified. The DIG-dUTP
labeled products were purified over a QiaQuick column (Qiagen) and approximately 1 µg of labeled probe was used
per hybridization.

DNA was electrophoresed in agarose gels and blotted on Zeta-Probe Blotting Membrane (Biorad) using the Vacuum
Blotting Unit (LKB Bromma). Sample preparation and transfer were essentially performed as described by Sambrook et
al. (21). The blots were incubated with the labeled probes overnight at 65 °C. Subsequently, the membranes were
washed twice with 300 mM NaCl, 30 mM Na-citrate, 0.1% SDS for 5 min at room temperature and twice with 15
mM NaCl, 1.5 mM Na-citrate, 0.1% SDS for 15 min at 55 °C. The membranes were used immediately for detection of
hybridization.
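The two wash solutions can be read as dilutions of standard saline-sodium citrate (SSC) buffer. The sketch below is an illustration rather than part of the described protocol: it assumes the usual definition of 1x SSC (150 mM NaCl, 15 mM sodium citrate), and the function name `ssc_fold` is hypothetical.

```python
import math

# Assumed standard definition of 1x SSC (not stated in the text itself).
SSC_1X_NACL_MM = 150.0
SSC_1X_CITRATE_MM = 15.0

def ssc_fold(nacl_mM: float, citrate_mM: float) -> float:
    """Return the SSC strength (e.g. 2.0 for 2x SSC) of a wash buffer."""
    fold_nacl = nacl_mM / SSC_1X_NACL_MM
    fold_citrate = citrate_mM / SSC_1X_CITRATE_MM
    # Both components must give the same dilution factor to be a true SSC dilution.
    if not math.isclose(fold_nacl, fold_citrate, rel_tol=1e-9):
        raise ValueError("NaCl and citrate are not in the 10:1 SSC ratio")
    return fold_nacl

low_stringency = ssc_fold(300.0, 30.0)   # first two washes, room temperature
high_stringency = ssc_fold(15.0, 1.5)    # last two washes, 55 °C
print(low_stringency, high_stringency)
```

Under that assumption the first washes correspond to 2x SSC and the final washes to 0.1x SSC, which is consistent with the high stringency referred to in the Results.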

Construction of the expression vectors. 

pWSKcitM. The citM gene was amplified by PCR using the Vent polymerase (Biolabs) from chromosomal DNA
isolated from B. subtilis 6GM. The forward primer 5'-TTAAGGGGCCATGGATGTGTTAGC-3' contained the putative
ribosomal binding site (indicated in italics) and the valine start codon (bold) and two mutations that result in an NcoI restriction
site (CCATGG) in front of the start codon. The NcoI site was engineered for future purposes. The backward primer
5'-GTCATTACGCCTGAATTCCTCATACG-3' contained two mutations that create an EcoRI site (italics) immediately
behind the TGA stop codon (bold). The EcoRI site at the 3' end of the PCR product was cut while the 5' end was left
blunt. The fragment was ligated into plasmid pWSK29 (25) digested with SmaI and EcoRI, yielding plasmid pWSKcitM.
In pWSKcitM, the open reading frame coding for CitM is downstream of the lac promoter on the vector and the B.
subtilis ribosomal binding site.

pWSKcitH. Open reading frame N15CR coding for CitH was amplified from the same chromosomal DNA. The forward
primer 5'-AAAAAAGCTTTTGAATAGGGGAGGTCATACCATGGTTGCCATAC-3' contained three mutations resulting in a HindIII
site in front of the ribosomal binding site and
an NcoI site around the start codon. The construction of the NcoI site results in the Leu2Val mutation in the primary
sequence. The backward primer 5'-ATGAATCTTCTAGACACTCATAGGATCCTATTGATC-3' was complementary to
sequences downstream of the stop codon. Four base changes resulted in the introduction of BamHI and XbaI sites
(italic) in the PCR product. The PCR product was digested with HindIII and BamHI and ligated into plasmid pWSK29
(27) digested with the same two enzymes. In the resulting vector pWSKcitH the citH gene is downstream of the T7
promoter and the B. subtilis ribosomal binding site. The base sequences of the inserts in pWSKcitM and pWSKcitH were
verified by sequencing the sense strand.
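The restriction sites engineered into the four cloning primers can be checked with a simple substring search, since all five recognition sequences involved are palindromic. The following is a minimal sketch: the primer labels are hypothetical, the recognition sequences are the standard ones for these enzymes, and the citH forward primer is written as read from the text above, so its exact bases should be treated as an assumption.

```python
# Standard recognition sequences of the enzymes named in the text.
SITES = {
    "NcoI": "CCATGG",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "XbaI": "TCTAGA",
}

# Primer labels are illustrative; sequences are written 5' to 3'.
PRIMERS = {
    "citM_forward": "TTAAGGGGCCATGGATGTGTTAGC",
    "citM_backward": "GTCATTACGCCTGAATTCCTCATACG",
    # Assumed reading of the citH forward primer.
    "citH_forward": "AAAAAAGCTTTTGAATAGGGGAGGTCATACCATGGTTGCCATAC",
    "citH_backward": "ATGAATCTTCTAGACACTCATAGGATCCTATTGATC",
}

def sites_in(sequence: str) -> list[str]:
    """Return the sorted names of all listed enzymes whose site occurs in the sequence."""
    seq = sequence.upper()
    return sorted(name for name, site in SITES.items() if site in seq)

for label, primer in PRIMERS.items():
    print(label, len(primer), sites_in(primer))
```

With these sequences each primer carries exactly the sites the text attributes to it: NcoI in the citM forward primer, EcoRI in the citM backward primer, HindIII and NcoI in the citH forward primer, and BamHI and XbaI in the citH backward primer.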


Transport assays. 

Whole cells. Cells of E. coli harboring plasmids pWSKcitM and pWSKcitH were grown in LB broth as described
above and washed twice in 50 mM potassium phosphate pH 7. Uptake of [1,5-14C]citrate was measured essentially as
described by Lolkema et al. (14). The cells were resuspended in 95 µl of the same buffer with the indicated additions
and incubated for 10 min at 37 °C. At time zero 5 µl [1,5-14C]citrate was added to give a final concentration of 4.5 µM.
Uptake was stopped by adding 2 ml of an ice cold 100 mM LiCl solution followed by immediate filtering over 0.45 µm
nitrocellulose filters. The filters were washed twice with the LiCl solution and immediately submerged in scintillation fluid
to stop any further metabolic activity. The radioactivity retained on the filter was quantified in a liquid scintillation
counter.
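The volumes and concentrations of the whole-cell assay imply a particular label stock concentration and amount of radioactivity per sample. The short calculation below is only a sketch under stated assumptions: it takes the assay volume as 95 µl + 5 µl, uses the specific activity of 111 mCi/mmol given under Materials, and assumes the label was not diluted with unlabeled citrate. The function names are hypothetical.

```python
def stock_concentration_uM(final_uM: float, cell_volume_ul: float, added_volume_ul: float) -> float:
    """Concentration the added label must have to reach final_uM after mixing."""
    total_ul = cell_volume_ul + added_volume_ul
    return final_uM * total_ul / added_volume_ul

def citrate_per_assay_nmol(final_uM: float, total_volume_ul: float) -> float:
    """Amount of citrate in one assay; µM x µl gives pmol, divided by 1000 gives nmol."""
    return final_uM * total_volume_ul / 1000.0

def activity_nCi(amount_nmol: float, specific_activity_mCi_per_mmol: float) -> float:
    """Radioactivity in nCi; mCi/mmol is numerically equal to nCi/nmol."""
    return amount_nmol * specific_activity_mCi_per_mmol

stock = stock_concentration_uM(4.5, 95.0, 5.0)
amount = citrate_per_assay_nmol(4.5, 100.0)
print(stock, amount, activity_nCi(amount, 111.0))
```

Under these assumptions the label stock would be 90 µM, and each 100 µl sample would contain 0.45 nmol citrate, or roughly 50 nCi.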

Membrane vesicles. Right-side-out (RSO) membrane vesicles were prepared of the E. coli cells harboring plasmids
pWSKcitM and pWSKcitH by the osmotic lysis procedure as described by Kaback (11). E. coli JM101/pWSKcitM was
grown in LB medium and JM109/pWSKcitH in minimal medium supplemented with 20 mM Na-citrate. The membranes
were resuspended in 50 mM potassium 1,4-piperazinediethanesulfonate (Pipes) pH 6.5 at a protein concentration of 15
mg/ml and stored in aliquots in liquid nitrogen. The membrane vesicles were energized by the K-ascorbate/phenazine
methosulfate (PMS) electron donor system. The membranes were diluted in 50 mM K-Pipes pH 6.5 and 10 mM
K-ascorbate in a total volume of 100 µl and incubated for 5 min at 30 °C under a constant flow of water saturated air. PMS
was added at a concentration of 100 µM and the proton motive force was allowed to develop for 1 min, after which
[1,5-14C]citrate was added to a final concentration of 4.5 µM. The uptake was stopped and the samples were processed as
described above.

Materials. [1,5-14C]citrate (111 mCi/mmol) was obtained from Amersham Radiochemical Center. Monopotassium
phosphate and potassium hydroxide with low Na+ content were obtained from Merck. All other chemicals were reagent
grade and obtained from commercial sources.

Data bank submission. The CitM base sequence has been submitted to the NCBI gene bank and will be
accessible under number U62003.

RESULTS 


Cloning and sequencing of CitM. The Mg2+-dependent citrate carrier of B. subtilis, which we will term CitM, was
cloned using conventional techniques. Chromosomal DNA of B. subtilis 6GM was partially digested with HindIII and the
fragments were cloned in the expression vector pINIIIA (8). Escherichia coli is an ideal host for the cloning of citrate
carriers since it is not capable of taking up citrate but metabolizes it readily in the citric acid cycle. The expression vectors
containing the chromosomal fragments were transformed to E. coli JM101 and grown on citrate indicator plates
(Simmons agar). A number of blue colonies, indicative of citrate uptake and metabolism, were assayed for their ability to take
up citrate in the presence and absence of 10 mM MgCl2. A clone (plasmid pM5) was selected that showed no uptake
activity in the absence of Mg2+ and a high uptake in the presence of Mg2+ (not shown). Restriction analysis of pM5
revealed the presence of a 5.6 kb insert containing 3 additional HindIII sites. A 2.8 kb EcoRI fragment of pM5 was
subcloned into the EcoRI site of pINIIIA. Only one of the two orientations of the insert (plasmid pM6) showed
Mg2+-dependent citrate uptake activity, revealing the presence of the gene on the fragment and showing that the gene is
expressed from the tandem promoter on pINIIIA. Further subcloning of pM6 resulted in the loss of the citrate utilizing
phenotype. pM6 was used to subclone the 0.9 kb EcoRI/HindIII fragment (pSK0.9) and the 1.8 kb HindIII-EcoRI
fragment (pSK1.8) in pBluescript. The two subclones were used to make sets of nested deletions to determine the
nucleotide sequence of the gene. Both subclones were sequenced in both directions. Reconstruction of the base sequence
of the insert on pM6 from the sequences of the two subclones revealed an open reading frame downstream of the
promoter region on the vector with a length of 1302 bp and starting with a GTG codon (Fig. 1). The length of the ORF
conforms to the expected length of a gene coding for a bacterial secondary transporter. A putative ribosomal binding site
that shows extensive similarity to the 3' end of B. subtilis 16S rRNA is located upstream of the GTG start codon. Neither
a clear promoter sequence nor a terminator sequence was found upstream and downstream of the ORF, respectively.
The complete base sequence has been deposited in the NCBI gene bank and will be available under accession number
U63002.

A non-redundant search of the available gene banks revealed an ORF of 1278 bp with 60% base identity with
the cloned gene. The ORF, indicated as N15CR, was detected in the bglS-katB intergenic region on the genome of B.
subtilis 168. The ORF (Fig. 2) starts with an ATG codon and is preceded by a ribosomal binding site. No clear promoter
region could be detected upstream. The alignment (Fig. 3) of the CitM sequence with the N15CR ORF shows that the
ATG codon that lies in between the ribosomal binding site and the putative CitM GTG start codon is unlikely to function
as the initiator of translation (Fig. 2).

The presence of the citM gene and the N15CR ORF on the genome of B. subtilis 6GM was confirmed by PCR and
Southern hybridization. DNA probes were prepared by amplifying the first 487 bp of citM and the last 463 bp of N15CR
by PCR using chromosomal DNA of B. subtilis 6GM as the template. The probes were selected such that they
contained no HindIII restriction sites. The PCR resulted in distinct DNA fragments of the expected length. The nucleotide
analog DIG-dUTP was incorporated into the fragments for use as probes in Southern blotting. The two probes
detected unique, but different fragments of B. subtilis 6GM genomic DNA restricted with HindIII. Both fragments
were of the expected length. Plasmids pWSKcitM and pWSKcitH, which contain the citM gene and ORF N15CR (see
below), hybridized exclusively with the citM and N15CR probes, respectively, in spite of the high sequence identity
between the two genes. The lack of cross-reaction reflects the high stringency of the hybridization and washing
conditions. Under these conditions the two probes did not detect similar genes on the chromosome of E. coli and the
thermophilic Bacillus species B. stearothermophilus.

Primary sequence analysis. Translation of the base sequences of the cloned ORF and the corresponding ORF
from the B. subtilis gene bank results in two proteins with approximately 60% amino acid sequence identity and an
additional 18% of similar residues. The amino acid composition of the two proteins is typical for integral membrane
proteins, with an average hydrophobicity of 0.51 and 0.47 on the normalized scale of Kyte (5), respectively. The hydropathy
profiles of the two sequences are remarkably similar. Significant differences show up only in the region from position
125 to 145 and to a lesser extent in the region from position 310 to 330. In both regions, CitM is the more hydrophobic
sequence. The high similarity of the two sequences suggests a similar folding in the membrane. Secondary structure
prediction (6) results in 12 membrane spanning, presumably α-helical, regions both for the CitM protein and the protein
coded by the N15CR ORF. Assuming similar folds for the two proteins, merging of the two predictions results in 12
transmembrane segments, interspaced by 11 loops. The respective nucleic acid sequences corresponding to the 12
transmembrane segments and the 11 loops relate more or less to the sequence as shown in figure 1 from position 7 to 71,
or from 72 to 77, or from 78 to 144, or from 145 to 182, or from 183 to 243, or from 244 to 272, or from 273 to 338, or
from 339 to 398, or from 399 to 458, or from 459 to 536, or from 537 to 596, or from 597 to 704, or from 705 to 773, or
from 774 to 848, or from 849 to 923, or from 924 to 965, or from 966 to 1037, or from 1038 to 1055, or from 1056 to
1109, or from 1110 to 1148, or from 1149 to 1202, or from 1203 to 1228, or from 1229 to 1316, respectively. It can be
expected that within these transmembrane fragments and loops lie topology requirements and active sites which, when
modified via recombinant technology, will alter the metal specificity of the encoded citrate transporter protein.
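The list of nucleotide positions can be checked for internal consistency. The sketch below tabulates the 23 ranges and labels them alternately as transmembrane segment and loop; the alternation starting with a transmembrane segment is an assumption made here for illustration, since the text gives only the ordered list, and the function names are hypothetical.

```python
# Nucleotide ranges in Figure 1 numbering, in the order given in the text.
RANGES = [
    (7, 71), (72, 77), (78, 144), (145, 182), (183, 243), (244, 272),
    (273, 338), (339, 398), (399, 458), (459, 536), (537, 596), (597, 704),
    (705, 773), (774, 848), (849, 923), (924, 965), (966, 1037), (1038, 1055),
    (1056, 1109), (1110, 1148), (1149, 1202), (1203, 1228), (1229, 1316),
]

def is_contiguous(ranges):
    """True if every range starts exactly one position after the previous one ends."""
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(ranges, ranges[1:]))

def annotate(ranges):
    """Label ranges alternately as TM and loop (assumed order), with index and length in nt."""
    rows = []
    for i, (start, end) in enumerate(ranges):
        kind = "TM" if i % 2 == 0 else "loop"
        rows.append((kind, i // 2 + 1, start, end, end - start + 1))
    return rows

rows = annotate(RANGES)
for kind, number, start, end, length in rows:
    print(f"{kind:4s} {number:2d}  {start:4d}-{end:4d}  {length:3d} nt")
```

Read this way, the list yields 12 transmembrane segments and 11 loops that tile positions 7 to 1316 without gaps or overlaps, in line with the topology described above.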

Substrate specificity. The citM gene and the N15CR ORF (CitH gene) were amplified by PCR using B. subtilis
chromosomal DNA as template. The citM fragment was cloned downstream of the lac promoter region on plasmid
pWSK29, a low copy pBluescript derivative (25), yielding plasmid pWSKcitM. The N15CR ORF was cloned behind the T7
promoter on the same plasmid, yielding pWSKcitH. The sequence of the cloned PCR fragments was verified in one
direction and found to be identical to the base sequence of the original ORFs except for the second codon of citH, which
now codes for Val instead of Leu (see the Methods section for details).

Plasmid pWSKcitM was transformed to E. coli JM101 and plasmid pWSKcitH to E. coli JM109(DE3), a strain that
contains a chromosomal copy of T7 polymerase, and the cells were plated on Simmons agar indicator plates. Surprisingly,
both plasmids conferred the citrate utilizing phenotype. Apparently, the N15CR ORF codes for a citrate transporter as
well. We will term this transporter CitH. Figure 4 shows the uptake of citrate in cells harboring plasmids pWSKcitM (A)
and pWSKcitH (B) in the presence of different concentrations of Mg2+. Citrate uptake activity in cells expressing CitM
is completely absent in the absence of Mg2+ ions. The uptake activity increases with increasing Mg2+ concentrations,
consistent with the Mg2+-citrate complex being the substrate of the carrier (2). In marked contrast, cells harbouring
plasmid pWSKcitH expressing the citrate carrier coded by ORF N15CR (citH) readily take up citrate in the absence of Mg2+.
Increasing concentrations of Mg2+ in the assay buffer result in decreasing uptake rates. Apparently, the substrate of
CitH is free citrate, as is the case for the Na+ and H+-dependent citrate carriers of K. pneumoniae (23,24) and the
membrane potential generating citrate carrier of L. mesenteroides (16).

Co-ion specificity. The involvement of Na+ ions in the uptake of citrate by CitM and CitH was investigated by
measuring the uptake of citrate in E. coli strains JM101/pWSKcitM and JM109(DE3)/pWSKcitH in the presence and
absence of 10 mM NaCl. Prior to the experiments the cells were washed three times in large volumes of potassium
phosphate pH 7 containing especially low amounts of Na+.

The residual Na+ ion concentration was at most a few µM. Furthermore, the uptake experiments were performed
in plastic tubes to prevent Na+ contamination from glassware. For both transporters the uptake of citrate was not
significantly different in the presence or absence of NaCl, indicating that Na+ is not a co-ion for CitM nor CitH (not shown).
Studies with membrane vesicles prepared from B. subtilis cells have demonstrated that the Mg2+ dependent citrate
transporter CitM is a secondary transporter that is driven by the proton motive force (2). The high similarity between
CitM and CitH suggests that the same is true for CitH. The energy coupling mechanism of both cloned transporters was
investigated by preparing right-side-out membrane vesicles of E. coli cells expressing CitM or CitH. The membranes
were energized by the ascorbate/PMS electron donor system. In the presence of a proton motive force, both
transporters accumulated citrate, indicating cotransport by CitM of the Mg2+/citrate complex and protons and by CitH of citrate
and protons. In the presence of the K+ ionophore valinomycin, which results in the dissipation of the membrane potential
component of the proton motive force, the uptake activity was slightly less. In the presence of nigericin, a K+/H+
antiporter, which dissipates the pH gradient across the membrane and results in a proton motive force that is solely
composed of the membrane potential, significant uptake above background is still observed. It is concluded that both CitM
and CitH are electrogenic transporters that translocate net positive charge into the cells. The complex between citrate
and Mg2+ is monovalent anionic (MgCit-) and, therefore, CitM cotransports at least 2 protons per Mg2+/citrate complex.
Electrogenic transport by CitH indicates cotransport of at least 3 or 4 protons depending on translocation of Hcit2- or
cit3-, respectively.
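The minimum proton stoichiometries follow from simple charge bookkeeping: if each transport cycle must carry net positive charge into the cell, the number of cotransported protons has to exceed the negative charge of the substrate. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def min_protons_for_net_positive(substrate_charge: int) -> int:
    """Smallest number of H+ so that substrate charge plus protons is at least +1."""
    return max(0, 1 - substrate_charge)

# Substrate species and charges as given in the text.
SUBSTRATES = {"MgCit-": -1, "Hcit2-": -2, "cit3-": -3}

for species, charge in SUBSTRATES.items():
    n = min_protons_for_net_positive(charge)
    print(f"{species}: at least {n} H+ (net charge {charge + n:+d})")
```

This reproduces the figures in the paragraph above: at least 2 protons for the MgCit- complex transported by CitM, and at least 3 or 4 protons for CitH depending on whether Hcit2- or cit3- is the translocated species.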

DISCUSSION 

The cloning and sequencing of the Mg2+ dependent citrate carrier of B. subtilis, CitM, led to the surprising finding of
a second citrate carrier in B. subtilis, CitH, for which the gene was deposited in the databanks as open reading frame
N15CR. The two transporters share common properties at different levels: (i) the coding genes are 61.5% homologous
and in the primary sequence alignment about 60% of the residues are identical, (ii) the transporters function as electrogenic
proton symporters and (iii) the genes coding for the transporters are present on the chromosome of B. subtilis and the
genes were not found in E. coli and B. stearothermophilus. The most striking difference between the two transporters
is that CitM transports the Mg2+/citrate complex while CitH transports free citrate in the uncomplexed state and is
hampered by the presence of Mg2+. In B. subtilis, CitM is induced by citrate in the medium and the absence of citrate uptake
by membrane vesicles in the presence of EDTA indicates that CitM is the only transporter induced under these
conditions (2). The experiments with the cloned transporter in E. coli emphasize that CitM transports only citrate in the Mg2+
complexed form. Therefore, CitH is not induced under the same conditions in B. subtilis. Open reading frame N15CR
that codes for CitH lies in between the genes bglS and katB on the B. subtilis genome. It is coded in the opposite
direction relative to these two genes. Gene bglS codes for lichenase, an exported enzyme that hydrolyses mixed-linked
β-glucans (27), and katB codes for a catalase involved in sporulation (13). The citH gene is not preceded by a known
promoter sequence and does not seem to be part of an operon structure, suggesting that the gene is silent. On the other
hand, the gene is preceded by a ribosomal binding site and results in a functional transporter when expressed from a
heterologous promoter. Either the gene has become silent only very recently on an evolutionary time scale or, more
likely, the gene is expressed under special, unknown conditions using an unknown promoter sequence.
The high sequence identity of CitM and CitH suggests that the binding sites for the Mg2+/citrate complex and citrate are
not very different. The two transporters may be very suitable for the construction of chimeric proteins to localize the
substrate binding site in the primary sequence. The successful construction of active chimeras can be tested on citrate
indicator plates and the Mg2+-dependency provides an easy way to discriminate between the activity of the two
transporters. We constructed one chimera by making use of a conserved StuI restriction site in the two genes around
position 490. The hybrid protein consisted of the N-terminal CitM fragment and the C-terminal CitH fragment. We are
now in the process of constructing a series of chimeras by introducing unique restriction sites at selected sites in the
citM and citH genes.
Figure legends

Figure 1

Nucleotide sequence and corresponding amino acid sequence of the CitM gene.

Figure 2

Nucleotide sequence and corresponding amino acid sequence of the CitH gene.

Figure 3

Alignment of amino acid sequences of the CitM and CitH genes.

Figure 4

Mg2+-ion dependence of the uptake activity of CitM (A) and CitH (B). [1,5-14C]citrate uptake by E. coli
JM101/pSKcitM (A) and E. coli JM109(DE3)/pSKcitH (B) was measured in 50 mM potassium phosphate pH 7 supplemented
with 0 (O), 0.5 (o), 1 (v), 5 (A) and 10 (□) mM MgCl2. The cell protein concentrations were 0.6 (A) and 1.2 (B)
mg/ml.
SEQUENCE LISTING



(i) APPLICANT: 

(A) NAME: Rijksuniversiteit Groningen

(B) STREET: Broerstraat 5 

(C) CITY: Groningen 

(D) STATE: Groningen 

(E) COUNTRY: the Netherlands 

(F) POSTAL CODE (ZIP) : 9712 CP 

(ii) TITLE OF INVENTION: Nucleic acid sequences encoding citrate
transporter proteins.

(iii) NUMBER OF SEQUENCES: 11 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(v) CURRENT APPLICATION DATA: 

APPLICATION NUMBER: EP 96203015.1 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
TTAAGGGGCC ATGGATGTGT TAGC 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
CTCCCCAAGG AATCGTGTTC 
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATGAATCTTC TAGACACTCA TAGGATCCTA TTGATC 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GTCATTACGC CTGAATTCCT CATACG 

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: other nucleic acid 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

AAAAAAGCTT TTGAATAGGG GAGGTCATAC CATGGTTGCC ATAC 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1314 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GGGGGAATGG ATGTGTTAGC AATCTTAGGC TTTCTCATGA TGCTTGTGTT TATGGCATTG 60 

ATCATGACAA AACGGCTTTC TGTTTTAACA GCATTAGTTT TGACGCCGAT TGTGTTTGCG 120

CTTATCGCCG GATTTGGATT TACTGAAGTT GGGGACATGA TGATTTCGGG GATTCAGCAA 180 

GTCGCGCCGA CTGCGGTCAT GATTATGTTT GCGATCTTAT ATTTTGGAAT TATGATTGAT 240 


ACAGGCCTGT TTGATCCAAT GGTTGGCAAA ATTTTAAGCA TGGTCAAAGG AGACCCTTTA 300 

AAAATTGTTG TCGGGACAGC GGTTCTTACA ATGCTCGTCG CCTTGGACGG AGATGGCTCG 360 

ACAACGTACA TGATTACGAC AAGCGCCATG CTTCCGCTCT ATTTGCTGCT AGGCATCCGG 420 

CCAATTATCT TGGCAGGAAT CGCGGGAGTC GGCATGGGAA TCATGAACAC GATTCCATGG 480

GGAGGTGCGA CGCCGAGGGC GGCGAGTGCG CTGGGGGTTG ATCCAGCTGA GCTTACAGGG 540 

CCGATGATTC CTGTCATTGC AAGCGGGATG CTTTGTATGG TGGCAGTTGC GTATGTGCTT 600 

GGAAAAGCGG AACGAAAGCG CCTTGGTGTG ATTGAACTGA AACAGCCAGC CAATGCCAAT 660

GAACCGGCTG CTGCGGTTGA AGATGAGTGG AAGCGGCCGA AGCTTTGGTG GTTCAATTTA 720 

TTGTTAACGC TTTCTTTAAT AGGATGTTTA GTATCGGGTA AAGTCAGTTT AACCGTACTG 780 

TTTGTCATTG CGTTTTGTAT TGCGCTGATT GTGAATTATC CCAATCTCGA GCATCAGAGA 840

CAGCGCATCG CGGCGCATTC CAGCAACGTG CTGGCTATCG GTTCAATGAT TTTTGCTGCG 900 

GGGGTGTTCA CGGGGATTTT GACAGGCACG AAAATGGTTG ATGAAATGGC GATCTCGCTC 960 

GTGTCCATGA TACCGGAACA AATGGGCGGA TTGATCCCGG CGATTGTTGC CTTAACAAGC 1020 

GGCATTTTCA CATTTTTGAT GCCGAATGAC GCGTATTTCT ACGGGGTGCT GCCGATTTTA 1080 

TCAGAAACAG CTGTCGCATA CGGTGTGGAT AAAGTGGAAA TTGCCAGAGC CTCTATTATC 1140 

GGCCAGCCGA TTCATATGCT GAGTCCGCTT GTGCCATCCA CTCATTTGCT TGTCGGACTC 1200 

GTCGGAGTTT CTATTGATGA CCATCAAAAA TTCGCATTGA AATGGGCGGT TCTCGCAGTG 1260 

ATCGTCATGA CGGCTATCGC TCTATTGATC GGTGCGATCT CTATTTCCGT ATGA 1314 
(2) INFORMATION FOR SEQ ID NO: 7: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 435 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Asp Val Leu Ala Ile Leu Gly Phe Leu Met Met Leu Val Phe Met
1 5 10 15

Ala Leu Ile Met Thr Lys Arg Leu Ser Val Leu Thr Ala Leu Val Leu
20 25 30

Thr Pro Ile Val Phe Ala Leu Ile Ala Gly Phe Gly Phe Thr Glu Val
35 40 45

Gly Asp Met Met Ile Ser Gly Ile Gln Gln Val Ala Pro Thr Ala Val
50 55 60

Met Ile Met Phe Ala Ile Leu Tyr Phe Gly Ile Met Ile Asp Thr Gly
65 70 75 80

Leu Phe Asp Pro Met Val Gly Lys Ile Leu Ser Met Val Lys Gly Asp
85 90 95

Pro Leu Lys Ile Val Val Gly Thr Ala Val Leu Thr Met Leu Val Ala
100 105 110

Leu Asp Gly Asp Gly Ser Thr Thr Tyr Met Ile Thr Thr Ser Ala Met
115 120 125

Leu Pro Leu Tyr Leu Leu Leu Gly Ile Arg Pro Ile Ile Leu Ala Gly
130 135 140

Ile Ala Gly Val Gly Met Gly Ile Met Asn Thr Ile Pro Trp Gly Gly
145 150 155 160

Ala Thr Pro Arg Ala Ala Ser Ala Leu Gly Val Asp Pro Ala Glu Leu
165 170 175

Thr Gly Pro Met Ile Pro Val Ile Ala Ser Gly Met Leu Cys Met Val
180 185 190

Ala Val Ala Tyr Val Leu Gly Lys Ala Glu Arg Lys Arg Leu Gly Val
195 200 205

Ile Glu Leu Lys Gln Pro Ala Asn Ala Asn Glu Pro Ala Ala Ala Val
210 215 220

Glu Asp Glu Trp Lys Arg Pro Lys Leu Trp Trp Phe Asn Leu Leu Leu
225 230 235 240

Thr Leu Ser Leu Ile Gly Cys Leu Val Ser Gly Lys Val Ser Leu Thr
245 250 255

Val Leu Phe Val Ile Ala Phe Cys Ile Ala Leu Ile Val Asn Tyr Pro
260 265 270

Asn Leu Glu His Gln Arg Gln Arg Ile Ala Ala His Ser Ser Asn Val
275 280 285

Leu Ala Ile Gly Ser Met Ile Phe Ala Ala Gly Val Phe Thr Gly Ile
290 295 300

Leu Thr Gly Thr Lys Met Val Asp Glu Met Ala Ile Ser Leu Val Ser
305 310 315 320

Met Ile Pro Glu Gln Met Gly Gly Leu Ile Pro Ala Ile Val Ala Leu
325 330 335

Thr Ser Gly Ile Phe Thr Phe Leu Met Pro Asn Asp Ala Tyr Phe Tyr
340 345 350

Gly Val Leu Pro Ile Leu Ser Glu Thr Ala Val Ala Tyr Gly Val Asp
355 360 365

Lys Val Glu Ile Ala Arg Ala Ser Ile Ile Gly Gln Pro Ile His Met
370 375 380

Leu Ser Pro Leu Val Pro Ser Thr His Leu Leu Val Gly Leu Val Gly
385 390 395 400

Val Ser Ile Asp Asp His Gln Lys Phe Ala Leu Lys Trp Ala Val Leu
405 410 415

Ala Val Ile Val Met Thr Ala Ile Ala Leu Leu Ile Gly Ala Ile Ser
420 425 430

Ile Ser Val
435



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1296 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown



(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

AGGGGAGGTC ATATCATGCT TGCCATACTC GGTTTTCTGA TGATGATTGT CTTTATGTAC 60 

CTTATTATGT CTAACCGGCT TTCCGCTCTT ATTGCTTTGA TTGTCGTTCC TATTGTGTTT 120

GCCCTGATCA GCGGATTTGG CAAAGATCTC GGCGAGATGA TGATTCAGGG CGTTACAGAC 180 

CTCGCCCCTA CCGGTATCAT GCTGTTATTC GCCATCCTGT ATTTCGGCAT TATGATTGAC 240 

TCAGGCCTGT TTGATCCTCT CATTGCCAAA ATCTTATCGT TTGTCAAAGG AGATCCGTTT 300

AAAATCGCCG TAGGCACAGC GGTTCTGACC ATGACCATTT CGCTGGACGG AGATGGGACA 360 

ACAACCTATA TGATTACCAT TCCAGCGATG CTGCCTCTCT ACAAACGGCT CGGCATGAAC 420 

CGTTTGGTGT TAGCGGGAAT AGCGATGCTT GGTTCGGGGG TTATGAATAT TATCCCGTGG 480 

GGCGAGCCGA CTGCGAGGGT TTTGGCTTCC TTAAAATTGG ACACGTCAGA GGTCTTTACA 540

CCGCTGATTC CCGCTATGAT CGCCGGCATT CTCTGGGTGA TCGCCGTTGC TTATATCCTC 600 

GGAAAGAAAG ACCGGAAGCG GCTCGGCGTC ATTTCGATTG ATCACGCACC GTCTTCCGAC 660 

CCGGAGGCCG CACCGCTCAA GCGTCCCGCT CTTCAATGGT TTAACCTGCT GCTGACTGTC 720 

GCTCTGATGG CCGCACTGAT CACCAGCCTG CTGCCGCTCC CTGTTCTTTT TATGACTGCG 780 



TTCGCCGTTG CCTTGATGGT TAACTATCCA AATGTCAAAG AGCAGCAGAA ACGAATCTCG 840 

GCGCATGCGG CTAATGCGTT AAACGTTGTC TCAATGGTGT TTGCTGCGGG CATATTCACA 900 

GGCATTCTCT CCGGCACAAA AATGGTGGAT GCCATGGCGC ATTCTACACG TTCACTCATC 960 

CCTGATGCCA TGGGCCCGCA CCTGCCGTTG ATCACTGCGA TCGTCAGCAT GCCCTTCACC 1020 

TTTTTCATGT CGAATGACGC CTTTTACTTC GGTGTCCTTC CCATCATCGC CGAAGCCGCT 1080 

TCCGCTTACG GAATAGACGC CGCTGAAATC GGGAGGGCCT CCTTGCTGGG GCAGCCTGTG 1140 

CATCTGCTCA GCCCGCTTGT GCCTTCCACC TATCTATTGG TAGGAATGGC AGGCGTCAGC 1200 

TTTGGCGACC ATCAAAAATT CACTATTAAA TGGGCCGTGG GAACAACGAT TGTGATGACC 1260 

ATTGCGGCGC TTTTGATTGG GATTATTTCT TTCTAA 1296

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 426 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Leu Ala Ile Leu Gly Phe Val Met Met Ile Val Phe Met Tyr Leu
1 5 10 15

Ile Met Ser Asn Arg Leu Ser Ala Leu Ile Ala Leu Ile Val Val Pro
20 25 30

Ile Val Phe Ala Leu Ile Ser Gly Phe Gly Lys Asp Leu Gly Glu Met
35 40 45

Met Ile Gln Gly Val Thr Asp Leu Ala Pro Thr Gly Ile Met Leu Leu
50 55 60

Phe Ala Ile Leu Tyr Phe Gly Ile Met Ile Asp Ser Gly Leu Phe Asp
65 70 75 80

Pro Leu Ile Ala Lys Ile Leu Ser Phe Val Lys Gly Asp Pro Phe Lys
85 90 95

Ile Ala Val Gly Thr Ala Val Leu Thr Met Thr Ile Ser Leu Asp Gly
100 105 110

Asp Gly Thr Thr Thr Tyr Met Ile Thr Ile Arg Ala Met Leu Pro Leu
115 120 125

Tyr Lys Arg Leu Gly Met Asn Arg Leu Val Leu Ala Gly Ile Ala Met
130 135 140

Leu Gly Ser Gly Val Met Asn Ile Ile Pro Trp Gly Glu Pro Thr Ala
145 150 155 160

Arg Val Leu Ala Ser Leu Lys Leu Asp Thr Ser Glu Val Phe Thr Pro
165 170 175

Leu Ile Pro Ala Met Ile Ala Gly Ile Leu Trp Val Ile Ala Val Ala
180 185 190

Tyr Ile Leu Gly Lys Lys Glu Arg Lys Arg Leu Gly Val Ile Ser Ile
195 200 205

Asp His Ala Pro Ser Ser Asp Pro Glu Ala Ala Pro Leu Lys Arg Pro
210 215 220

Ala Leu Gln Trp Phe Asn Leu Leu Leu Thr Val Ala Leu Met Ala Ala
225 230 235 240

Leu Ile Thr Ser Leu Leu Pro Leu Pro Val Leu Phe Met Thr Ala Phe
245 250 255

Ala Val Ala Leu Met Val Asn Tyr Pro Asn Val Lys Glu Gln Gln Lys
260 265 270

Arg Ile Ser Ala His Ala Gly Asn Ala Leu Asn Val Val Ser Met Val
275 280 285

Phe Ala Ala Gly Ile Phe Thr Gly Ile Leu Ser Gly Thr Lys Met Val
290 295 300

Asp Ala Met Ala His Ser Thr Arg Ser Leu Ile Pro Asp Ala Met Gly
305 310 315 320

Pro His Leu Pro Leu Ile Thr Ala Ile Val Ser Met Pro Phe Thr Phe
325 330 335

Phe Met Ser Asn Asp Ala Phe Tyr Phe Gly Val Leu Pro Ile Ile Ala
340 345 350

Glu Ala Ala Ser Ala Tyr Gly Ile Asp Ala Ala Glu Ile Gly Arg Ala
355 360 365

Ser Leu Leu Gly Gln Pro Val His Leu Leu Ser Pro Leu Val Pro Ser
370 375 380

Thr Tyr Leu Leu Val Gly Met Ala Gly Val Ser Phe Gly Asp His Gln
385 390 395 400

Lys Phe Thr Ile Lys Trp Ala Val Gly Thr Thr Ile Val Met Thr Ile
405 410 415

Ala Ala Leu Leu Ile Gly Ile Ile Ser Phe
420 425

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 426 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Leu Ala Ile Leu Gly Phe Val Met Met Ile Val Phe Met Tyr Leu
1 5 10 15

Ile Met Ser Asn Arg Leu Ser Ala Leu Ile Ala Leu Ile Val Val Pro
20 25 30

Ile Val Phe Ala Leu Ile Ser Gly Phe Gly Lys Asp Leu Gly Glu Met
35 40 45

Met Ile Gln Gly Val Thr Asp Leu Ala Pro Thr Gly Ile Met Leu Leu
50 55 60

Phe Ala Ile Leu Tyr Phe Gly Ile Met Ile Asp Ser Gly Leu Phe Asp
65 70 75 80

Pro Leu Ile Ala Lys Ile Leu Ser Phe Val Lys Gly Asp Pro Phe Lys
85 90 95

Ile Ala Val Gly Thr Ala Val Leu Thr Met Thr Ile Ser Leu Asp Gly
100 105 110

Asp Gly Thr Thr Thr Tyr Met Ile Thr Ile Arg Ala Met Leu Pro Leu
115 120 125

Tyr Lys Arg Leu Gly Met Asn Arg Leu Val Leu Ala Gly Ile Ala Met
130 135 140

Leu Gly Ser Gly Val Met Asn Ile Ile Pro Trp Gly Glu Pro Thr Ala
145 150 155 160

Arg Val Leu Ala Ser Leu Lys Leu Asp Thr Ser Glu Val Phe Thr Pro
165 170 175

Leu Ile Pro Ala Met Ile Ala Gly Ile Leu Trp Val Ile Ala Val Ala
180 185 190

Tyr Ile Leu Gly Lys Lys Glu Arg Lys Arg Leu Gly Val Ile Ser Ile
195 200 205

Asp His Ala Pro Ser Ser Asp Pro Glu Ala Ala Pro Leu Lys Arg Pro
210 215 220

Ala Leu Gln Trp Phe Asn Leu Leu Leu Thr Val Ala Leu Met Ala Ala
225 230 235 240

Leu Ile Thr Ser Leu Leu Pro Leu Pro Val Leu Phe Met Thr Ala Phe
245 250 255

Ala Val Ala Leu Met Val Asn Tyr Pro Asn Val Lys Glu Gln Gln Lys
260 265 270

Arg Ile Ser Ala His Ala Gly Asn Ala Leu Asn Val Val Ser Met Val
275 280 285

Phe Ala Ala Gly Ile Phe Thr Gly Ile Leu Ser Gly Thr Lys Met Val
290 295 300

Asp Ala Met Ala His Ser Thr Arg Ser Leu Ile Pro Asp Ala Met Gly
305 310 315 320

Pro His Leu Pro Leu Ile Thr Ala Ile Val Ser Met Pro Gly Thr Phe
325 330 335

Phe Met Ser Asn Asp Ala Phe Tyr Phe Gly Val Leu Pro Ile Ile Ala
340 345 350

Glu Ala Ala Ser Ala Tyr Gly Ile Asp Ala Ala Glu Ile Gly Arg Ala
355 360 365

Ser Leu Leu Gly Gln Pro Val His Leu Leu Ser Pro Leu Val Pro Ser
370 375 380

Thr Tyr Leu Leu Val Gly Met Ala Gly Val Ser Phe Gly Asp His Gln
385 390 395 400

Lys Phe Thr Ile Lys Trp Ala Val Gly Thr Thr Ile Val Met Thr Ile
405 410 415

Ala Ala Leu Leu Ile Gly Ile Ile Ser Phe
420 425

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 437 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Asp Val Leu Ala Ile Leu Gly Phe Leu Met Met Leu Val Phe Met
1 5 10 15

Ala Leu Ile Met Thr Lys Arg Leu Ser Val Leu Thr Ala Leu Val Leu
20 25 30

Thr Pro Ile Val Phe Ala Leu Ile Ala Gly Phe Gly Phe Thr Glu Val
35 40 45

Gly Asp Met Met Ile Ser Gly Ile Gln Gln Val Ala Pro Thr Ala Val
50 55 60

Met Ile Met Phe Ala Ile Leu Tyr Phe Gly Ile Met Ile Asp Thr Gly
65 70 75 80

Leu Phe Asp Pro Met Val Gly Lys Ile Leu Ser Met Val Lys Gly Asp
85 90 95

Pro Leu Lys Ile Val Val Gly Thr Ala Val Leu Thr Met Leu Val Ala
100 105 110

Leu Asp Gly Asp Gly Ser Thr Thr Tyr Met Ile Thr Thr Ser Ala Met
115 120 125

Leu Pro Leu Tyr Leu Leu Leu Gly Ile Arg Pro Ile Ile Leu Ala Gly
130 135 140

Ile Ala Gly Val Gly Met Gly Ile Met Asn Thr Ile Pro Trp Gly Gly
145 150 155 160

Ala Thr Pro Ser Gly Ser Ala Leu Gly Val Asp Pro Ala Glu Leu Thr
165 170 175

Gly Pro Met Ile Pro Val Ile Ala Ser Gly Met Leu Cys Met Val Ala
180 185 190

Val Ala Tyr Val Leu Gly Lys Ala Glu Arg Lys Arg Leu Gly Val Ile
195 200 205

Glu Leu Lys Gln Pro Ala Asn Ala Asn Glu Pro Ala Ala Ala Val Glu
210 215 220

Asp Glu Trp Lys Pro Ala Lys Leu Trp Trp Phe Asn Leu Leu Leu Thr
225 230 235 240

Leu Ser Leu Ile Gly Cys Leu Val Ser Gly Lys Val Ser Leu Thr Val
245 250 255

Leu Phe Val Ile Ala Phe Cys Ile Ala Leu Ile Val Asn Tyr Pro Asn
260 265 270

Leu Glu His Gln Arg Gln Arg Ile Ala Ala His Ser Ser Asn Val Leu
275 280 285

Ala Ile Gly Ser Met Ile Phe Ala Ala Gly Val Phe Thr Gly Ile Leu
290 295 300

Thr Gly Thr Lys Met Val Asp Glu Met Ala Ile Ser Leu Val Ser Met
305 310 315 320

Ile Pro Glu Gln Met Gly Gly Leu Ile Pro Ala Ile Val Ala Leu Thr
325 330 335

Ser Gly Ile Phe Thr Phe Leu Met Pro Asn Asp Ala Tyr Phe Tyr Gly
340 345 350

Val Leu Pro Ile Leu Ser Glu Thr Ala Val Ala Tyr Gly Val Asp Lys
355 360 365

Val Glu Ile Ala Arg Ala Ser Ile Ile Gly Gln Pro Ile His Met Leu
370 375 380

Ser Pro Leu Val Pro Ser Thr His Leu Leu Val Gly Leu Val Gly Leu
385 390 395 400

Val Gly Val Ser Ile Asp Asp His Gln Lys Phe Ala Leu Lys Trp Ala
405 410 415

Val Leu Ala Val Ile Val Met Thr Ala Ile Ala Leu Leu Ile Gly Ala
420 425 430

Ile Ser Ile Ser Val
435
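
The nucleotide and amino acid entries of the listing can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code and assuming that the open reading frame of each nucleotide entry runs from its first ATG to a stop codon at the very end of the entry. It translates the first 30 nucleotides of SEQ ID NO: 6 and compares the expected protein lengths with those stated for SEQ ID NO: 7 and SEQ ID NO: 9. The helper names `translate` and `coding_length` are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, start: int) -> str:
    """Translate dna from a 0-based offset until a stop codon or the end."""
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def coding_length(total_nt: int, atg_offset: int) -> int:
    """Residues encoded from the ATG at atg_offset (0-based) to a terminal stop codon."""
    return (total_nt - atg_offset) // 3 - 1


# First 30 nucleotides of SEQ ID NO: 6 as printed in the listing.
SEQ6_START = "GGGGGAATGGATGTGTTAGCAATCTTAGGC"
first_atg = SEQ6_START.find("ATG")
peptide = translate(SEQ6_START, first_atg)

print(peptide)                  # one-letter code of the first residues
print(coding_length(1314, 6))   # SEQ ID NO: 6, ATG at nucleotide 7
print(coding_length(1296, 15))  # SEQ ID NO: 8, ATG at nucleotide 16
```

Under these assumptions the translated start agrees with the first eight residues of SEQ ID NO: 7 (Met Asp Val Leu Ala Ile Leu Gly), and the computed lengths agree with the 435 and 426 amino acids stated for SEQ ID NO: 7 and SEQ ID NO: 9.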
Claims

1. A recombinant nucleic acid comprising a nucleic acid sequence as shown in figure 1 or functional fragments or
functional derivatives thereof, or a recombinant nucleic acid which is at least 65% homologous to the sequence
shown in figure 1.

2. A recombinant nucleic acid consisting of a gene encoding a citrate transporter protein and having the nucleic acid
sequence of figure 2 or functional fragments or functional derivatives thereof.

3. A nucleic acid according to claim 1 or 2 which encodes a citrate transporter protein which transports free citrate.

4. A nucleic acid according to claim 1 or 2 which encodes a citrate transporter protein which transports a metal-citrate
complex.

5. A nucleic acid according to any one of claims 1 to 4 in which a nucleic acid sequence corresponding to the
sequence as shown in figure 1 from position 7 to 71, or from 72 to 77, or from 78 to 144, or from 145 to 182, or from
183 to 243, or from 244 to 272, or from 273 to 338, or from 339 to 398, or from 399 to 458, or from 459 to 536, or
from 537 to 596, or from 597 to 704, or from 705 to 773, or from 774 to 848, or from 849 to 923, or from 924 to 965,
or from 966 to 1037, or from 1038 to 1055, or from 1056 to 1109, or from 1110 to 1148, or from 1149 to 1202, or
from 1203 to 1228, or from 1229 to 1316, has been modified.

6. A vector comprising a nucleic acid according to any one of claims 1 to 5.

7. A host cell comprising a vector according to claim 6.

8. A host cell comprising a nucleic acid according to any one of claims 1 to 5.

9. A host cell expressing a protein encoded by a nucleic acid according to any one of claims 1 to 5.

10. A citrate transporter protein obtainable by growing a host cell according to any one of claims 7 to 9.

11. A process for recovering metal comprising using a host cell according to any one of claims 7 to 9.

12. A process for recovering metal comprising using a protein according to claim 10.

13. Use of a microorganism comprising a nucleotide sequence as shown in figure 1 or 2 or comprising a nucleotide
sequence which is at least 65% homologous to the sequence shown in figure 1 or 2 for the industrial recovery of
metal.
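
The position ranges recited in claim 5 can be checked mechanically. The following is a minimal sketch that encodes the ranges exactly as recited, confirms that they tile positions 7 to 1316 without gaps or overlaps, and looks up which segment contains a given position. The names `CLAIM5_RANGES`, `is_contiguous` and `segment_for` are hypothetical and serve only this illustration.

```python
# Position ranges as recited in claim 5 (1-based, inclusive).
CLAIM5_RANGES = [
    (7, 71), (72, 77), (78, 144), (145, 182), (183, 243), (244, 272),
    (273, 338), (339, 398), (399, 458), (459, 536), (537, 596), (597, 704),
    (705, 773), (774, 848), (849, 923), (924, 965), (966, 1037), (1038, 1055),
    (1056, 1109), (1110, 1148), (1149, 1202), (1203, 1228), (1229, 1316),
]


def is_contiguous(ranges):
    """True if each range starts one position after the previous range ends."""
    return all(
        next_start == prev_end + 1
        for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:])
    )


def segment_for(position, ranges):
    """Return the 1-based index of the range containing position, or None."""
    for index, (start, end) in enumerate(ranges, start=1):
        if start <= position <= end:
            return index
    return None


print(is_contiguous(CLAIM5_RANGES))     # whether the ranges tile the sequence
print(len(CLAIM5_RANGES))               # number of recited segments
print(segment_for(490, CLAIM5_RANGES))  # segment containing position 490
```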

1 aggggaggtc atatcatgct tgccatactc ggttttgtga tgatgattgt
MLAILGFVMMI

51 ctttatgtac cttattatgt ctaaccggct ttccgctctt attgctttga
VFMYLIMSNRLSALIAL

101 ttgtcgttcc tattgtgttt gccctgatca gcggatttgg caaagatctc
IVVPIVFALISGFGKDL

151 ggcgagatga tgattcaggg cgttacagac ctcgccccta ccggtatcat
GEMMIQGVTDLAPTGI

201 gctgttattc gccatcctgt atttcggcat tatgattgac tcaggcctgt
MLLFAILYFGIMIDSGL

251 ttgatcctct cattgccaaa atcttatcgt ttgtcaaagg agatccgttt
FDPLIAKILSFVKGDPF

301 aaaatcgccg taggcacagc ggttctgacc atgaccattt cgctggacgg
KIAVGTAVLTMTISLD

351 agatgggaca acaacctata tgattaccat tcgagcgatg ctgcctctct
GDGTTTYMITIRAMLPL

401 acaaacggct cggcatgaac cgtttggtgt tagcgggaat agcgatgctt
YKRLGMNRLVLAGIAML

451 ggttcggggg ttatgaatat tatcccgtgg ggcgagccga ctgcgagggt
GSGVMNIIPWGEPTAR

501 tttggcttcc ttaaaattgg acacgtcaga ggtctttaca ccgctgattc
VLASLKLDTSEVFTPLI

551 ccgctatgat cgccggcatt ctctgggtga tcgccgttgc ttatatcctc
PAMIAGILWVIAVAYIL

601 ggaaagaaag agcggaagcg gctcggcgtc atttcgattg atcacgcacc
GKKERKRLGVISIDHA

651 gtcttccgac ccggaggccg caccgctcaa gcgtcccgct cttcaatggt
PSSDPEAAPLKRPALQW

701 ttaacctgct gctgactgtc gctctgatgg ccgcactgat caccagcctg
FNLLLTVALMAALITSL

751 ctgccgctcc ctgttctttt tatgactgcg ttcgccgttg ccttgatggt
LPLPVLFMTAFAVALM

801 taactatcca aatgtcaaag agcagcagaa acgaatctcg gcgcatgcgg
VNYPNVKEQQKRISAHA

851 gcaatgcgtt aaacgttgtc tcaatggtgt ttgctgcggg catattcaca
GNALNVVSMVFAAGIFT

901 ggcattctct ccggcacaaa aatggtggat gccatggcgc attctacacg
GILSGTKMVDAMAHST

951 ttcactcatc cctgatgcca tgggcccgca cctgccgttg atcactgcga
RSLIPDAMGPHLPLITA

1001 tcgtcagcat gcccttcacc tttttcatgt cgaatgacgc cttttacttc
IVSMPFTFFMSNDAFYF

1051 ggtgtccttc ccatcatcgc cgaagccgct tccgcttacg gaatagacgc
GVLPIIAEAASAYGID

1101 cgctgaaatc gggagggcct ccttgctggg gcagcctgtg catctgctca
AAEIGRASLLGQPVHLL

1151 gcccgcttgt gccttccacc tatctattgg taggaatggc aggcgtcagc
SPLVPSTYLLVGMAGVS

1201 tttggcgacc atcaaaaatt cactattaaa tgggccgtgg gaacaacgat
FGDHQKFTIKWAVGTT

1251 tgtgatgacc attgcggcgc ttttgattgg gattatttct ttctaa
IVMTIAALLIGIISF-
[Figure: pairwise alignment of the CitHbs (residues 1 to 425) and CitMbs (residues 1 to 437) amino acid sequences, shown in blocks of 50 alignment positions with rows of asterisks above the aligned blocks.]